Updated: 2020-11-21 12:47:34 PDT

Original version created 2020-05-03. See below for the version history.

Intro


The spread of COVID-19, the disease caused by the SARS-CoV-2 virus, defies description in terms of a single statistic. To be informed about personal risk, we need to know more than how many people have been sick at a national or even state level. We need to know how many people are currently sick in our community and how the number of sick people is changing at the state and even county level. It can be hard to find this information.

This analysis seeks to partially fill that gap. It includes:
1. Several national pictures of disease trends to enable a “large pattern” view of how the disease has evolved and is evolving on a country-wide scale.
2. A per capita analysis of disease spread.
3. A more granular analysis of regions, states, and counties to shed light on local disease pattern evolution.
4. Details of the time evolution of growth statistics.


This computed document is part of a constantly evolving analysis, so please “refresh” for the latest updates. If you have suggestions or comments, please reach out on Twitter @WinstonOnData or Facebook.


You are welcome to visit my code repository on Github.
You are also welcome to visit my analysis on the Politics of COVID.
Finally, you can always check my Rpubs for new documents and updates.

National Statistics

Total & Active Cases, and Deaths

These trend charts show the national disease statistics. Note that the raw daily counts vary systematically with the Monday-Friday work week.
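One common way to damp a weekly reporting cycle is a 7-day trailing average. The sketch below is only an illustration in Python (the report itself is built in R and uses spline smoothing, described in the appendix); the function name `trailing_mean` and the sample counts are hypothetical.

```python
def trailing_mean(values, window=7):
    """Trailing moving average: entry i averages the last `window` values up to i.

    Early entries, where fewer than `window` values exist, average what is available.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just fell out of the window.
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed


# Hypothetical two weeks of daily counts with a weekend dip (not real data).
daily = [100, 110, 120, 115, 105, 60, 50] * 2
weekly = trailing_mean(daily)
print([round(v, 1) for v in weekly])
```

Once a full week is inside the window, the weekend dip no longer shows up in the averaged series, because every window contains exactly one of each weekday.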

Mortality and \(R_e\)

Distribution of \(R_e\) Values

There is a wide distribution of \(R_e\) across regions and counties. The distributions in the graph below look roughly symmetrical because the x-scale is logarithmic.
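To see why a logarithmic axis makes the distribution look symmetrical, note that a halving (\(R_e = 0.5\)) and a doubling (\(R_e = 2\)) sit at equal distances from \(R_e = 1\) on a log scale. A minimal Python sketch with illustrative values only:

```python
import math


def log_position(r_e):
    """Position of an R_e value on a base-10 logarithmic axis (R_e = 1 maps to 0)."""
    return math.log10(r_e)


# Illustrative values: reciprocal pairs land symmetrically around zero.
for r_e in (0.5, 0.8, 1.0, 1.25, 2.0):
    print(r_e, round(log_position(r_e), 3))
```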

National Maps

State Level Data

There are several maps below. These include:

  • pandemic total cases (How many people have been sick?)
  • pandemic total cases per capita (What fraction of people have been sick?)
  • daily cases per capita (What fraction of people are getting sick?)
  • forecast short term cases per capita (based on \(R_e\)) (How fast is the disease growing or shrinking?)
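The per capita quantities in the list above are rates normalized by population. As a minimal sketch, assuming the rate is expressed per 100,000 residents (the unit used in the state table later in this document), the conversion could look like this in Python. The function name `per_100k` and the example numbers are hypothetical.

```python
def per_100k(count, population):
    """Convert a raw count into a rate per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to keep round numbers exact in floating point.
    return count * 100_000 / population


# Hypothetical state: 450 new cases per day among 1,000,000 residents.
print(per_100k(450, 1_000_000))
```

Normalizing this way lets a small state and a large state be compared directly, which raw counts do not allow.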

Pandemic Totals

Current Status of Active Disease

Computed Reproduction Rate \(R_e\).

How many cases are there per day, per capita, in each state? You can see the number of current cases varies widely. I also include a forecast of the number of cases about a week from now given current trends. Most states currently show improvement.

Mapped County Data

While the state-level data tell a remarkable story, it is also interesting to look at county-level data.


state            R_e    cases  daily_cases  daily_cases_per_100k
North Dakota    1.01    70099         1351             179.60625
Wyoming         1.20    27139          970             166.71364
South Dakota    0.95    69774         1255             147.65505
Nebraska        1.16   110985         2659             139.59764
Minnesota       1.13   257258         7377             133.46340
Montana         1.19    53446         1385             133.00682
Iowa            0.97   205590         4109             131.17323
Wisconsin       1.08   365368         7539             130.46878
New Mexico      1.49    75910         2611             124.81029
Utah            1.15   169644         3598             118.17109
Kansas          1.19   136361         3402             116.95641
Rhode Island    1.59    39786         1097             103.82250
Indiana         1.13   285109         6826             102.84107
Colorado        1.15   189916         5532             100.06472
Illinois        1.02   635933        12629              98.49864
Idaho           1.13    90157         1562              92.54602
Michigan        1.15   316576         8108              81.42616
Missouri        1.07   252651         4876              80.06487
Oklahoma        1.13   167231         2915              74.39760
Kentucky        1.20   154786         3122              70.31208
Ohio            1.13   335476         8053              69.17268
Nevada          1.17   130017         1962              67.14882
Tennessee       1.05   326145         4306              64.74128
Connecticut     1.26   100880         2234              62.37603
Arkansas        1.09   139173         1752              58.58217
West Virginia   1.26    38449         1071              58.55486
Louisiana       1.15   215452         2677              57.40181
Pennsylvania    1.24   300708         6650              51.98894
Arizona         1.28   291868         3520              50.67165
Alabama         1.11   228123         2254              46.33398
New Jersey      1.12   298621         4002              45.05821
Texas           1.13  1151453        12497              44.81750
Mississippi     1.13   140070         1278              42.76018
Delaware        1.17    30576          404              42.54893
Florida         1.21   919519         8002              38.84817
Maryland        1.22   177030         2305              38.39469
Massachusetts   1.07   193924         2492              36.48506
North Carolina  1.18   329273         3531              34.76891
New Hampshire   1.29    16444          436              32.44960
South Carolina  1.11   202869         1605              32.38548
Washington      1.15   147325         2354              32.27161
California      1.25  1093808        11978              30.59612
Georgia         1.13   421748         3105              30.15300
Oregon          1.13    61994         1107              27.12892
New York        1.11   589185         5113              26.06220
Virginia        1.21   167263         1737              24.93845
Vermont         1.41     3434          123              19.68072
Maine           1.12     9946          209              15.68112

Regional Snapshots

Regional snapshots reveal the highly nuanced behavior of disease spread. Each snapshot includes multiple states and selected counties.

How to read the charts

There are four components:
1. State Maps show the number of active cases, with the Reproduction Rate encoded as color.
2. State Graphs show state-wide trends.
3. Severity Ranking is a table of counties where the highest number of new cases are expected. Severity is a compounded function \(f(R, cases(t))\). This is useful for finding new (often unexpected) “hot spots.” Per capita rates are also included.
4. County Graphs encode the R-value in the number of active cases. R is the Reproduction Rate.

(NOTE: R < 1 implies a shrinking number of active cases, and R > 1 implies a growing number of active cases. For R = 1, active cases are stable.)
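To make the idea of a severity ranking concrete, here is a rough Python sketch. The version history states only that severity comes from the number of new cases expected 6 days ahead at the current \(R_e\); the exact formula is not given. The projection rule below (one factor of R per 12-day period), the function names, and the county figures are all assumptions for illustration, not the method used in this report.

```python
def projected_daily_cases(daily_cases, r_e, horizon_days=6, period_days=12):
    """Project daily cases forward at a constant R_e.

    ASSUMPTION: cases change by one factor of R_e every `period_days`.
    The report does not state its projection formula.
    """
    return daily_cases * r_e ** (horizon_days / period_days)


def rank_by_severity(counties):
    """Sort counties from most to fewest projected daily cases."""
    return sorted(
        counties,
        key=lambda c: projected_daily_cases(c["daily_cases"], c["r_e"]),
        reverse=True,
    )


# Hypothetical counties (not real data).
counties = [
    {"name": "County A", "daily_cases": 200, "r_e": 0.9},
    {"name": "County B", "daily_cases": 160, "r_e": 1.6},
    {"name": "County C", "daily_cases": 50, "r_e": 1.1},
]
for county in rank_by_severity(counties):
    print(county["name"])
```

In this example County B outranks County A even though it has fewer cases today, because its higher R pushes its projected count above A's. That is the sense in which a compounded score can surface unexpected hot spots.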


Washington and Oregon

California

Four Corners

Mid-Atlantic

Deep South

Florida and Georgia

Texas & Oklahoma

Michigan & Wisconsin

Minnesota, North Dakota, and South Dakota

Connecticut, Massachusetts, and Rhode Island

New York

Vermont, New Hampshire, and Maine

Carolinas

North-Rockies

Midwest

Tennessee and Kentucky

Missouri and Arkansas

Conclusions

It’s in control in some places, but not all places. And many places are completely out of control.

Stay Safe!
Be Diligent!
…and PLEASE WEAR A MASK



Built with R Version 4.0.3
This document took 264.8 seconds to compute.
2020-11-21 12:51:59

Version History

Today is 2020-11-21.
185 days ago: plots of multiple states.
177 days ago: include \(R_e\) computation.
174 days ago: created color coding for \(R_e\) plots.
169 days ago: reduced \(t_d\) from 14 to 12 days. 14 was the upper range of what most people are using. Wanted slightly higher bandwidth.
169 days ago: “persistence” time evolution.
162 days ago: “In control” mapping.
162 days ago: “Severity” tables to county analysis. Severity is computed from the number of new cases expected at current \(R_e\) for 6 days in the future. It does not trend \(R_e\), which could be a future enhancement.
154 days ago: Added census API functionality to compute per capita infection rates. Reduced spline spar to 0.65.
149 days ago: Added Per Capita US Map.
147 days ago: Deprecated national map. It can be found here.
143 days ago: added state “Hot 10” analysis.
138 days ago: cleaned up county analysis to show cases and actual data. Moved “Hot 10” analysis to separate web page. Moved “Hot 10” here.
136 days ago: added per capita disease and mortality to state-level analysis.
124 days ago: changed to county boundaries on national map for per capita disease.
119 days ago: corrected factor of two error in death trend data.
115 days ago: removed “contained and uncontained” analysis, replacing it with county level control map.
110 days ago: added county level “baseline control” and \(R_e\) maps.
106 days ago: fixed normalization error on total disease stats plot.
99 days ago: Corrected some text matching in generating county level plots of \(R_e\).
93 days ago: adapted knot spacing for spline.
79 days ago: using separate knot spacing for spline fits of deaths and cases.
77 days ago: MAJOR UPDATE. Moved things around. Added per capita severity map.
49 days ago: improved national trends with per capita analysis.
48 days ago: added county level per capita daily cases map. Testing new color scheme.
21 days ago: changed to daily mortality tracking from ratio of overall totals.
14 days ago: added trend line to state charts.

Appendix: Methods

Disease data are sourced from the NYTimes Github Repo. Population data are sourced from the US Census (census.gov).

Case growth is assumed to follow a linear partial differential equation. This type of model is useful in populations where there is still very low immunity and high susceptibility.

\[\frac{\partial}{\partial t} cases(t, t_d) = a \times cases(t, t_d) \] Here \(cases(t, t_d)\), abbreviated \(cases(t)\) below, is the number of active cases at time \(t\), which depends on the recent history over \(t_d\). The constant \(a\) has units of \(time^{-1}\) and is typically computed on a daily basis.

Solution results are often expressed in terms of the Effective Reproduction Rate \(R_e\), where \[a \space = \space \ln(R_e).\]

\(R_e\) has a simple interpretation; when \(R_e \space > \space 1\) the number of \(cases(t)\) increases (exponentially) while when \(R_e \space < \space 1\) the number of \(cases(t)\) decreases.
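As a worked step, and assuming \(a\) can be treated as constant over a short interval (an idealization, since in practice it drifts over time), the growth equation above integrates to

\[cases(t) = cases(0)\, e^{a t} = cases(0)\, R_e^{\,t},\]

where \(t\) is measured in the same time units used for \(a\). Setting \(R_e = 1\) gives \(a = 0\) and a constant number of cases, which is the boundary between growth and decay.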

Practically, computing \(a\) can be extremely complicated, depending on how it is functionally related to the history \(t_d\). Guessing functional forms can be as much art as science. To avoid that, let’s keep things simple…

Assuming a straightforward flat time of latent infection \(t_d = 12\) days, with \[f(t) = \int_{t - t_d}^{t}cases(t')\; dt' ,\] \(R_e\) reduces to a simple computation

\[R_e(t) = \frac{cases(t)}{\int_{t - t_d}^{t}cases(t')\; dt'} \times t_d .\]
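The formula above compares today's cases with the average over the preceding \(t_d\) days. A minimal Python sketch follows, assuming daily data and approximating the integral by a sum over the \(t_d\) most recent daily values (including day \(t\)). The function name `effective_reproduction`, that discretization choice, and the sample series are illustrative assumptions. The report itself is computed in R on spline-smoothed data.

```python
def effective_reproduction(cases, t_d=12):
    """Estimate R_e(t) = cases(t) * t_d / (sum of cases over the last t_d days).

    ASSUMPTION: the integral is approximated by a plain daily sum.
    Returns None where the window is incomplete or has no cases.
    """
    if t_d < 1:
        raise ValueError("t_d must be at least 1")
    estimates = []
    for t in range(len(cases)):
        if t + 1 < t_d:
            estimates.append(None)  # not enough history yet
            continue
        total = sum(cases[t - t_d + 1 : t + 1])
        estimates.append(cases[t] * t_d / total if total > 0 else None)
    return estimates


# Hypothetical series (not real data): flat, growing 5% per day, shrinking 5% per day.
flat = effective_reproduction([100] * 15)
growing = effective_reproduction([100 * 1.05 ** i for i in range(20)])
shrinking = effective_reproduction([100 * 0.95 ** i for i in range(20)])
print(flat[-1], round(growing[-1], 2), round(shrinking[-1], 2))
```

A flat series gives exactly 1, a rising series gives a value above 1, and a falling series gives a value below 1, matching the interpretation of \(R_e\) given above.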

Typical values of \(t_d\) fall in the range \(7 \leq t_d \leq 14\) days. The only other numerical treatment is that, to reduce noise in the data, I smooth the case data with a smoothing spline before computing derivatives.


DISCLAIMER: Results are for entertainment purposes only. Please consult local authorities for official data and forecasts.